{
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1-final"
  },
  "orig_nbformat": 2,
  "kernelspec": {
   "name": "Python 3.7.1 64-bit ('base': conda)",
   "display_name": "Python 3.7.1 64-bit ('base': conda)",
   "metadata": {
    "interpreter": {
     "hash": "2266c607543d224cb119288ea55888d6fda87cc9a4c78c02ed099d39082a76ce"
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2,
 "cells": [
  {
   "source": [
    "![cover](https://user-images.githubusercontent.com/43134199/95018149-58cfc200-0690-11eb-9e64-760faec5130f.png)\n",
    "\n",
    "```bash\n",
    "pip install pdb-profiling\n",
    "```"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## Load Modules"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "from pdb_profiling import default_config\n",
    "\n",
    "your_output_folder = \"C:/GitWorks/pdb-profiling/test/demo\"\n",
    "\n",
    "default_config(your_output_folder)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pdb_profiling.processors import SIFTS, SIFTSs, PDB, PDBs\n",
    "from pdb_profiling.utils import DisplayPDB"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "SIFTS.chain_filter, SIFTS.entry_filter = '', ''"
   ]
  },
  {
   "source": [
    "## Background Knowledge\n",
    "\n",
    "* [Guide to Understanding PDB Data](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/introduction)\n",
    "\n",
    "## Demo UniProt"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "demo = SIFTS('P21359')"
   ]
  },
  {
   "source": [
    "## Monomeric Protein\n",
    "\n",
    "* `pdb_id`: PDB Entry ID\n",
    "* `entity_id`: the entity identifier of a PDB Entity; for *what is a PDB Entity?* see [this link](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/beginner%E2%80%99s-guide-to-pdb-structures-and-the-pdbx-mmcif-format)\n",
    "* `chain_id`: the chain identifier of a PDB Chain\n",
    "* `struct_asym_id`: the chain identifier of a PDB Chain (unique across all PDB Entities)\n",
    "* `Entry`: UniProt Entry ID\n",
    "* `UniProt`: UniProt Isoform ID\n",
    "* `is_canonical`: whether the UniProt Isoform is the canonical sequence of that UniProt Entry\n",
    "* `identity`: sequence identity between the corresponding PDB Chain's sequence (complete SEQRES) and the UniProt Isoform's sequence\n",
    "* `unp_range`: mapped range of the UniProt Isoform's sequence against its corresponding PDB Chain's sequence (1-based index)\n",
    "* `pdb_range`: mapped range of the PDB Chain's sequence against its corresponding UniProt Isoform's sequence (1-based index)\n",
    "* `new_unp_range`: InDel-corrected mapped range of the UniProt Isoform's sequence against its corresponding PDB Chain's sequence (1-based index)\n",
    "* `new_pdb_range`: InDel-corrected mapped range of the PDB Chain's sequence against its corresponding UniProt Isoform's sequence (1-based index)\n",
    "* `conflict_pdb_range`: the chain's residue indices, within the mapped range, of residues that conflict with the UniProt Isoform's sequence (1-based index)\n",
    "* `select_tag`: whether the chain is in the recommended representative set\n",
    "* `select_rank`: the rank among all the chains (1 denotes the best)"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Wall time: 1.07 s\n"
     ]
    },
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "   UniProt chain_id  entity_id  identity  is_canonical pdb_id struct_asym_id  \\\n",
       "0   P21359        A          1      0.94          True   1nf1              A   \n",
       "1   P21359        A          1      1.00          True   3pg7              A   \n",
       "2   P21359        B          1      1.00          True   3pg7              B   \n",
       "3   P21359        B          2      0.94          True   6v65              B   \n",
       "4   P21359        A          1      1.00          True   2d4q              A   \n",
       "5   P21359        B          1      1.00          True   2d4q              B   \n",
       "6   P21359        A          1      0.94          True   3peg              A   \n",
       "7   P21359        B          2      0.92          True   6ob3              B   \n",
       "8   P21359        D          2      0.92          True   6ob3              D   \n",
       "9   P21359        B          2      0.92          True   6ob2              B   \n",
       "10  P21359        D          2      0.92          True   6ob2              D   \n",
       "11  P21359        A          1      0.98          True   3p7z              A   \n",
       "12  P21359        B          1      0.98          True   3p7z              B   \n",
       "13  P21359        B          2      0.94          True   6v6f              B   \n",
       "14  P21359        A          1      0.99          True   2e2x              A   \n",
       "15  P21359        B          1      0.99          True   2e2x              B   \n",
       "\n",
       "              pdb_range                  unp_range   Entry  ... resolution  \\\n",
       "0             [[1,333]]              [[1198,1551]]  P21359  ...      2.500   \n",
       "1             [[1,256]]              [[1581,1837]]  P21359  ...      2.189   \n",
       "2             [[1,256]]              [[1581,1837]]  P21359  ...      2.189   \n",
       "3             [[2,329]]              [[1203,1551]]  P21359  ...      2.763   \n",
       "4             [[1,257]]              [[1581,1837]]  P21359  ...      2.300   \n",
       "5             [[1,257]]              [[1581,1837]]  P21359  ...      2.300   \n",
       "6   [[5,172],[174,290]]  [[1566,1733],[1721,1837]]  P21359  ...      2.524   \n",
       "7             [[2,256]]              [[1209,1484]]  P21359  ...      2.100   \n",
       "8             [[2,256]]              [[1209,1484]]  P21359  ...      2.100   \n",
       "9             [[2,256]]              [[1209,1484]]  P21359  ...      2.845   \n",
       "10            [[2,256]]              [[1209,1484]]  P21359  ...      2.845   \n",
       "11            [[5,276]]              [[1566,1837]]  P21359  ...      2.650   \n",
       "12            [[5,276]]              [[1566,1837]]  P21359  ...      2.650   \n",
       "13            [[2,329]]              [[1203,1551]]  P21359  ...      2.542   \n",
       "14            [[6,277]]              [[1566,1837]]  P21359  ...      2.500   \n",
       "15            [[6,277]]              [[1566,1837]]  P21359  ...      2.500   \n",
       "\n",
       "   experimental_method_class  experimental_method  multi_method  \\\n",
       "0                      x-ray    X-ray diffraction         False   \n",
       "1                      x-ray    X-ray diffraction         False   \n",
       "2                      x-ray    X-ray diffraction         False   \n",
       "3                      x-ray    X-ray diffraction         False   \n",
       "4                      x-ray    X-ray diffraction         False   \n",
       "5                      x-ray    X-ray diffraction         False   \n",
       "6                      x-ray    X-ray diffraction         False   \n",
       "7                      x-ray    X-ray diffraction         False   \n",
       "8                      x-ray    X-ray diffraction         False   \n",
       "9                      x-ray    X-ray diffraction         False   \n",
       "10                     x-ray    X-ray diffraction         False   \n",
       "11                     x-ray    X-ray diffraction         False   \n",
       "12                     x-ray    X-ray diffraction         False   \n",
       "13                     x-ray    X-ray diffraction         False   \n",
       "14                     x-ray    X-ray diffraction         False   \n",
       "15                     x-ray    X-ray diffraction         False   \n",
       "\n",
       "    revision_date deposition_date 1/resolution id_score  select_tag  \\\n",
       "0        20171004        19980708     0.400000      -65       False   \n",
       "1        20110713        20101031     0.456830      -65       False   \n",
       "2        20110713        20101031     0.456830      -66       False   \n",
       "3        20200805        20191204     0.361925      -66       False   \n",
       "4        20110713        20051022     0.434783      -65       False   \n",
       "5        20110713        20051022     0.434783      -66       False   \n",
       "6        20190717        20101026     0.396197      -65       False   \n",
       "7        20191113        20190319     0.476190      -66       False   \n",
       "8        20191113        20190319     0.476190      -68       False   \n",
       "9        20191113        20190319     0.351494      -66       False   \n",
       "10       20191113        20190319     0.351494      -68       False   \n",
       "11       20190717        20101013     0.377358      -65       False   \n",
       "12       20190717        20101013     0.377358      -66        True   \n",
       "13       20200805        20191205     0.393391      -66        True   \n",
       "14       20110713        20061118     0.400000      -65       False   \n",
       "15       20110713        20061118     0.400000      -66       False   \n",
       "\n",
       "    select_rank  \n",
       "0            16  \n",
       "1             5  \n",
       "2             6  \n",
       "3            11  \n",
       "4             2  \n",
       "5             4  \n",
       "6             9  \n",
       "7            14  \n",
       "8            13  \n",
       "9            15  \n",
       "10           12  \n",
       "11            3  \n",
       "12            1  \n",
       "13           10  \n",
       "14            7  \n",
       "15            8  \n",
       "\n",
       "[16 rows x 48 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>UniProt</th>\n      <th>chain_id</th>\n      <th>entity_id</th>\n      <th>identity</th>\n      <th>is_canonical</th>\n      <th>pdb_id</th>\n      <th>struct_asym_id</th>\n      <th>pdb_range</th>\n      <th>unp_range</th>\n      <th>Entry</th>\n      <th>...</th>\n      <th>resolution</th>\n      <th>experimental_method_class</th>\n      <th>experimental_method</th>\n      <th>multi_method</th>\n      <th>revision_date</th>\n      <th>deposition_date</th>\n      <th>1/resolution</th>\n      <th>id_score</th>\n      <th>select_tag</th>\n      <th>select_rank</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.94</td>\n      <td>True</td>\n      <td>1nf1</td>\n      <td>A</td>\n      <td>[[1,333]]</td>\n      <td>[[1198,1551]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.500</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20171004</td>\n      <td>19980708</td>\n      <td>0.400000</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>16</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>True</td>\n      <td>3pg7</td>\n      <td>A</td>\n      <td>[[1,256]]</td>\n      <td>[[1581,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.189</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20101031</td>\n      <td>0.456830</td>\n      <td>-65</td>\n      <td>False</td>\n      
<td>5</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>True</td>\n      <td>3pg7</td>\n      <td>B</td>\n      <td>[[1,256]]</td>\n      <td>[[1581,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.189</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20101031</td>\n      <td>0.456830</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>6</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>2</td>\n      <td>0.94</td>\n      <td>True</td>\n      <td>6v65</td>\n      <td>B</td>\n      <td>[[2,329]]</td>\n      <td>[[1203,1551]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.763</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20200805</td>\n      <td>20191204</td>\n      <td>0.361925</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>11</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>True</td>\n      <td>2d4q</td>\n      <td>A</td>\n      <td>[[1,257]]</td>\n      <td>[[1581,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.300</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20051022</td>\n      <td>0.434783</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>2</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>True</td>\n      <td>2d4q</td>\n      <td>B</td>\n      <td>[[1,257]]</td>\n      <td>[[1581,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.300</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20051022</td>\n      
<td>0.434783</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>4</td>\n    </tr>\n    <tr>\n      <th>6</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.94</td>\n      <td>True</td>\n      <td>3peg</td>\n      <td>A</td>\n      <td>[[5,172],[174,290]]</td>\n      <td>[[1566,1733],[1721,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.524</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190717</td>\n      <td>20101026</td>\n      <td>0.396197</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>9</td>\n    </tr>\n    <tr>\n      <th>7</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>2</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>6ob3</td>\n      <td>B</td>\n      <td>[[2,256]]</td>\n      <td>[[1209,1484]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.100</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191113</td>\n      <td>20190319</td>\n      <td>0.476190</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>14</td>\n    </tr>\n    <tr>\n      <th>8</th>\n      <td>P21359</td>\n      <td>D</td>\n      <td>2</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>6ob3</td>\n      <td>D</td>\n      <td>[[2,256]]</td>\n      <td>[[1209,1484]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.100</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191113</td>\n      <td>20190319</td>\n      <td>0.476190</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>13</td>\n    </tr>\n    <tr>\n      <th>9</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>2</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>6ob2</td>\n      <td>B</td>\n      <td>[[2,256]]</td>\n      <td>[[1209,1484]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.845</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n  
    <td>False</td>\n      <td>20191113</td>\n      <td>20190319</td>\n      <td>0.351494</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>15</td>\n    </tr>\n    <tr>\n      <th>10</th>\n      <td>P21359</td>\n      <td>D</td>\n      <td>2</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>6ob2</td>\n      <td>D</td>\n      <td>[[2,256]]</td>\n      <td>[[1209,1484]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.845</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191113</td>\n      <td>20190319</td>\n      <td>0.351494</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>12</td>\n    </tr>\n    <tr>\n      <th>11</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.98</td>\n      <td>True</td>\n      <td>3p7z</td>\n      <td>A</td>\n      <td>[[5,276]]</td>\n      <td>[[1566,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.650</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190717</td>\n      <td>20101013</td>\n      <td>0.377358</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>3</td>\n    </tr>\n    <tr>\n      <th>12</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>1</td>\n      <td>0.98</td>\n      <td>True</td>\n      <td>3p7z</td>\n      <td>B</td>\n      <td>[[5,276]]</td>\n      <td>[[1566,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.650</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190717</td>\n      <td>20101013</td>\n      <td>0.377358</td>\n      <td>-66</td>\n      <td>True</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>13</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>2</td>\n      <td>0.94</td>\n      <td>True</td>\n      <td>6v6f</td>\n      <td>B</td>\n      <td>[[2,329]]</td>\n      <td>[[1203,1551]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.542</td>\n 
     <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20200805</td>\n      <td>20191205</td>\n      <td>0.393391</td>\n      <td>-66</td>\n      <td>True</td>\n      <td>10</td>\n    </tr>\n    <tr>\n      <th>14</th>\n      <td>P21359</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>2e2x</td>\n      <td>A</td>\n      <td>[[6,277]]</td>\n      <td>[[1566,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.500</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20061118</td>\n      <td>0.400000</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>7</td>\n    </tr>\n    <tr>\n      <th>15</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>2e2x</td>\n      <td>B</td>\n      <td>[[6,277]]</td>\n      <td>[[1566,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.500</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20061118</td>\n      <td>0.400000</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>8</td>\n    </tr>\n  </tbody>\n</table>\n<p>16 rows × 48 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 5
    }
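    ],
    "source": [
     "%time df1 = demo.pipe_select_mo().result()\n",
     "# NOTE: df1 contains all PDB chains mapped to this UniProt entry\n",
     "df1\n",
     "# NOTE: the recommended chains are those with select_tag == True, ranked by a series of scores; coverage is also considered, preferring structures that cover as much of the UniProt Isoform sequence as possible"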
   ]
  },
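  {
   "source": [
    "## Prepare `Interactome3D` Metadata\n",
    "\n",
    "The code below only needs to be run once. It does not need to be run again afterwards, even after restarting the kernel or reopening the notebook.\n",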
    "\n",
    "```py\n",
    "from pdb_profiling.processors.i3d.api import Interactome3D\n",
    "\n",
    "Interactome3D.pipe_init_interaction_meta().result()\n",
    "```"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# NOTE: if you have not run the code below before, uncomment it\n",
    "# It only needs to be run once\n",
    "# from pdb_profiling.processors.i3d.api import Interactome3D\n",
    "\n",
    "# Interactome3D.pipe_init_interaction_meta().result()"
   ]
  },
  {
   "source": [
    "## Homomeric Protein\n",
    "\n",
    "> Interaction metadata comes from `Interactome3D`\n",
    "\n",
    "* `_1`: denotes partner chain 1\n",
    "* `_2`: denotes partner chain 2\n",
    "* `assembly_id`\n",
    "    * 0 stands for the asymmetric unit\n",
    "    * 1 stands for biological assembly 1\n",
    "    * 2 stands for biological assembly 2\n",
    "    * and so on\n",
    "    * for *what are the asymmetric unit and the biological assembly?* see [this link](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies)\n",
    "* `model_id`: the model ID of the chain in the corresponding biological assembly PDB format file\n",
    "    * 0 denotes the first model\n",
    "* `unp_range_DSC`: the Dice Similarity Coefficient of `new_unp_range_1` & `new_unp_range_2`\n",
    "* `interface_range_1`: the range of the interaction's interface from the perspective of partner chain 1 (1-based index)\n",
    "* `interface_range_2`: the range of the interaction's interface from the perspective of partner chain 2 (1-based index)\n",
    "* `unp_interface_range_1`: the range of the interaction's interface from the perspective of partner chain 1 (mapped to the UniProt Isoform)\n",
    "* `unp_interface_range_2`: the range of the interaction's interface from the perspective of partner chain 2 (mapped to the UniProt Isoform)\n",
    "* `i_select_tag`: whether the interaction is in the recommended interaction representative set\n",
    "* `i_select_rank`: the rank among all the interacting chains (1 denotes the best)"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stderr",
     "text": [
      "100%|██████████| 10/10 [00:02<00:00,  3.38it/s]\n",
      "Wall time: 4.25 s\n"
     ]
    },
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "   entity_id_1 chain_id_1 struct_asym_id_1 struct_asym_id_in_assembly_1  \\\n",
       "0            1          A                A                            A   \n",
       "1            1          A                A                            A   \n",
       "2            1          A                A                            A   \n",
       "3            1          A                A                            A   \n",
       "4            1          A                A                            A   \n",
       "5            1          A                A                            A   \n",
       "6            1          A                A                            A   \n",
       "7            2          B                B                            B   \n",
       "8            2          B                B                            B   \n",
       "9            1          A                A                            A   \n",
       "\n",
       "   asym_id_rank_1  model_id_1 molecule_type_1  \\\n",
       "0               1           1  polypeptide(L)   \n",
       "1               1           1  polypeptide(L)   \n",
       "2               1           1  polypeptide(L)   \n",
       "3               1           1  polypeptide(L)   \n",
       "4               1           1  polypeptide(L)   \n",
       "5               1           1  polypeptide(L)   \n",
       "6               1           1  polypeptide(L)   \n",
       "7               1           1  polypeptide(L)   \n",
       "8               1           1  polypeptide(L)   \n",
       "9               1           1  polypeptide(L)   \n",
       "\n",
       "                                     surface_range_1  \\\n",
       "0  [[7,44],[46,47],[49,62],[64,82],[84,84],[86,25...   \n",
       "1  [[7,44],[46,47],[49,62],[64,82],[84,84],[86,25...   \n",
       "2  [[28,44],[47,48],[50,52],[54,62],[64,81],[83,8...   \n",
       "3  [[28,44],[47,48],[50,52],[54,62],[64,81],[83,8...   \n",
       "4  [[1,11],[13,25],[27,28],[30,42],[44,236],[238,...   \n",
       "5  [[1,11],[13,25],[27,28],[30,42],[44,236],[238,...   \n",
       "6              [[1,24],[28,235],[237,238],[240,256]]   \n",
       "7  [[12,48],[50,53],[55,73],[75,76],[78,80],[82,8...   \n",
       "8  [[12,48],[50,53],[55,76],[78,80],[82,85],[87,1...   \n",
       "9  [[24,30],[32,44],[46,61],[63,153],[177,269],[2...   \n",
       "\n",
       "                                   interface_range_1  entity_id_2  ...  \\\n",
       "0  [[50,51],[53,53],[86,86],[91,91],[167,168],[17...            1  ...   \n",
       "1  [[50,51],[53,53],[86,86],[91,91],[167,168],[17...            1  ...   \n",
       "2  [[51,52],[54,54],[87,87],[92,92],[168,169],[17...            1  ...   \n",
       "3  [[51,52],[54,54],[87,87],[92,92],[168,169],[17...            1  ...   \n",
       "4  [[31,32],[34,34],[67,67],[72,72],[148,149],[15...            1  ...   \n",
       "5  [[31,32],[34,34],[67,67],[72,72],[148,149],[15...            1  ...   \n",
       "6  [[31,32],[34,34],[67,67],[72,72],[148,149],[15...            1  ...   \n",
       "7  [[228,228],[231,232],[234,235],[237,238],[241,...            2  ...   \n",
       "8  [[125,125],[231,231],[234,235],[237,238],[241,...            2  ...   \n",
       "9  [[95,96],[99,100],[102,103],[107,107],[110,111...            1  ...   \n",
       "\n",
       "  id_score_2 select_tag_2 select_rank_2  in_i3d  best_select_rank  \\\n",
       "0        -66         True             1   False                 1   \n",
       "1        -66         True             1    True                 1   \n",
       "2        -66        False             8   False                 7   \n",
       "3        -66        False             8    True                 7   \n",
       "4        -66        False             4   False                 2   \n",
       "5        -66        False             4    True                 2   \n",
       "6        -66        False             6   False                 5   \n",
       "7        -68        False            13   False                13   \n",
       "8        -68        False            12   False                12   \n",
       "9        -65        False             9    True                 9   \n",
       "\n",
       "  unp_range_DSC                              unp_interface_range_1  \\\n",
       "0           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "1           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "2           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "3           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "4           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "5           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "6           1.0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...   \n",
       "7           1.0  ((1456, 1456), (1459, 1460), (1462, 1463), (14...   \n",
       "8           1.0  ((1332, 1332), (1459, 1459), (1462, 1463), (14...   \n",
       "9           1.0  ((1656, 1657), (1660, 1661), (1663, 1664), (16...   \n",
       "\n",
       "                               unp_interface_range_2 i_select_tag  \\\n",
       "0  ((1611, 1612), (1614, 1614), (1647, 1647), (16...        False   \n",
       "1  ((1611, 1612), (1614, 1614), (1647, 1647), (16...        False   \n",
       "2  ((1611, 1612), (1614, 1614), (1647, 1647), (16...         True   \n",
       "3  ((1611, 1612), (1614, 1614), (1647, 1647), (16...        False   \n",
       "4  ((1590, 1590), (1611, 1612), (1614, 1614), (16...        False   \n",
       "5  ((1590, 1590), (1611, 1612), (1614, 1614), (16...        False   \n",
       "6  ((1611, 1612), (1614, 1614), (1647, 1647), (16...        False   \n",
       "7  ((1298, 1298), (1302, 1302), (1306, 1306), (14...         True   \n",
       "8  ((1306, 1306), (1422, 1422), (1425, 1425), (14...         True   \n",
       "9  ((1656, 1657), (1660, 1661), (1663, 1664), (16...         True   \n",
       "\n",
       "   i_select_rank  \n",
       "0              9  \n",
       "1             10  \n",
       "2              4  \n",
       "3              5  \n",
       "4              7  \n",
       "5              8  \n",
       "6              6  \n",
       "7              1  \n",
       "8              2  \n",
       "9              3  \n",
       "\n",
       "[10 rows x 115 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>entity_id_1</th>\n      <th>chain_id_1</th>\n      <th>struct_asym_id_1</th>\n      <th>struct_asym_id_in_assembly_1</th>\n      <th>asym_id_rank_1</th>\n      <th>model_id_1</th>\n      <th>molecule_type_1</th>\n      <th>surface_range_1</th>\n      <th>interface_range_1</th>\n      <th>entity_id_2</th>\n      <th>...</th>\n      <th>id_score_2</th>\n      <th>select_tag_2</th>\n      <th>select_rank_2</th>\n      <th>in_i3d</th>\n      <th>best_select_rank</th>\n      <th>unp_range_DSC</th>\n      <th>unp_interface_range_1</th>\n      <th>unp_interface_range_2</th>\n      <th>i_select_tag</th>\n      <th>i_select_rank</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[7,44],[46,47],[49,62],[64,82],[84,84],[86,25...</td>\n      <td>[[50,51],[53,53],[86,86],[91,91],[167,168],[17...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>True</td>\n      <td>1</td>\n      <td>False</td>\n      <td>1</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>False</td>\n      <td>9</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[7,44],[46,47],[49,62],[64,82],[84,84],[86,25...</td>\n      <td>[[50,51],[53,53],[86,86],[91,91],[167,168],[17...</td>\n      
<td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>True</td>\n      <td>1</td>\n      <td>True</td>\n      <td>1</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>False</td>\n      <td>10</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[28,44],[47,48],[50,52],[54,62],[64,81],[83,8...</td>\n      <td>[[51,52],[54,54],[87,87],[92,92],[168,169],[17...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>8</td>\n      <td>False</td>\n      <td>7</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>True</td>\n      <td>4</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[28,44],[47,48],[50,52],[54,62],[64,81],[83,8...</td>\n      <td>[[51,52],[54,54],[87,87],[92,92],[168,169],[17...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>8</td>\n      <td>True</td>\n      <td>7</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>False</td>\n      <td>5</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[1,11],[13,25],[27,28],[30,42],[44,236],[238,...</td>\n      <td>[[31,32],[34,34],[67,67],[72,72],[148,149],[15...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>False</td>\n      
<td>4</td>\n      <td>False</td>\n      <td>2</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1590, 1590), (1611, 1612), (1614, 1614), (16...</td>\n      <td>False</td>\n      <td>7</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[1,11],[13,25],[27,28],[30,42],[44,236],[238,...</td>\n      <td>[[31,32],[34,34],[67,67],[72,72],[148,149],[15...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>4</td>\n      <td>True</td>\n      <td>2</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1590, 1590), (1611, 1612), (1614, 1614), (16...</td>\n      <td>False</td>\n      <td>8</td>\n    </tr>\n    <tr>\n      <th>6</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[1,24],[28,235],[237,238],[240,256]]</td>\n      <td>[[31,32],[34,34],[67,67],[72,72],[148,149],[15...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>6</td>\n      <td>False</td>\n      <td>5</td>\n      <td>1.0</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>((1611, 1612), (1614, 1614), (1647, 1647), (16...</td>\n      <td>False</td>\n      <td>6</td>\n    </tr>\n    <tr>\n      <th>7</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[12,48],[50,53],[55,73],[75,76],[78,80],[82,8...</td>\n      <td>[[228,228],[231,232],[234,235],[237,238],[241,...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>13</td>\n      <td>False</td>\n      <td>13</td>\n      <td>1.0</td>\n      <td>((1456, 
1456), (1459, 1460), (1462, 1463), (14...</td>\n      <td>((1298, 1298), (1302, 1302), (1306, 1306), (14...</td>\n      <td>True</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>8</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[12,48],[50,53],[55,76],[78,80],[82,85],[87,1...</td>\n      <td>[[125,125],[231,231],[234,235],[237,238],[241,...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>12</td>\n      <td>False</td>\n      <td>12</td>\n      <td>1.0</td>\n      <td>((1332, 1332), (1459, 1459), (1462, 1463), (14...</td>\n      <td>((1306, 1306), (1422, 1422), (1425, 1425), (14...</td>\n      <td>True</td>\n      <td>2</td>\n    </tr>\n    <tr>\n      <th>9</th>\n      <td>1</td>\n      <td>A</td>\n      <td>A</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[24,30],[32,44],[46,61],[63,153],[177,269],[2...</td>\n      <td>[[95,96],[99,100],[102,103],[107,107],[110,111...</td>\n      <td>1</td>\n      <td>...</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>9</td>\n      <td>True</td>\n      <td>9</td>\n      <td>1.0</td>\n      <td>((1656, 1657), (1660, 1661), (1663, 1664), (16...</td>\n      <td>((1656, 1657), (1660, 1661), (1663, 1664), (16...</td>\n      <td>True</td>\n      <td>3</td>\n    </tr>\n  </tbody>\n</table>\n<p>10 rows × 115 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 7
    }
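   ],
   "source": [
    "%time df2 = demo.pipe_select_ho(run_as_completed=True, progress_bar=tqdm).result()\n",
    "# NOTE: df2 holds every homomeric interacting chain pair matched to this UniProt entry; each row is one interaction (two chains of the same protein interacting)\n",
    "df2\n",
    "# NOTE: The recommended records are those with i_select_tag == True, ranked by a series of scores; interface coverage is also considered, so that as many distinct types of homomeric interaction structures as possible are selected"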
   ]
  },
  {
   "source": [
    "## Heteromeric Protein\n",
    "\n",
    "> Interaction metadata comes from `Interactome3D`\n",
    "\n",
    "* `i_group`: the Heteromeric Interaction Group"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stderr",
     "text": [
      "100%|██████████| 4/4 [00:01<00:00,  2.17it/s]\n",
      "Wall time: 6.39 s\n"
     ]
    },
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "     entity_id_1 chain_id_1 struct_asym_id_1 struct_asym_id_in_assembly_1  \\\n",
       "0              2          B                B                            B   \n",
       "1              2          B                B                            B   \n",
       "2              2          B                B                            B   \n",
       "3              2          B                B                            B   \n",
       "4              2          B                B                            B   \n",
       "..           ...        ...              ...                          ...   \n",
       "187            1          C                C                            C   \n",
       "188            1          C                C                            C   \n",
       "189            1          C                C                            C   \n",
       "190            1          C                C                            C   \n",
       "191            1          C                C                            C   \n",
       "\n",
       "     asym_id_rank_1  model_id_1 molecule_type_1  \\\n",
       "0                 1           1  polypeptide(L)   \n",
       "1                 1           1  polypeptide(L)   \n",
       "2                 1           1  polypeptide(L)   \n",
       "3                 1           1  polypeptide(L)   \n",
       "4                 1           1  polypeptide(L)   \n",
       "..              ...         ...             ...   \n",
       "187               1           1  polypeptide(L)   \n",
       "188               1           1  polypeptide(L)   \n",
       "189               1           1  polypeptide(L)   \n",
       "190               1           1  polypeptide(L)   \n",
       "191               1           1  polypeptide(L)   \n",
       "\n",
       "                                       surface_range_1  \\\n",
       "0    [[4,59],[61,83],[85,86],[88,91],[93,137],[139,...   \n",
       "1    [[4,59],[61,83],[85,86],[88,91],[93,137],[139,...   \n",
       "2    [[4,59],[61,83],[85,86],[88,91],[93,137],[139,...   \n",
       "3    [[4,59],[61,83],[85,86],[88,91],[93,137],[139,...   \n",
       "4    [[4,59],[61,83],[85,86],[88,91],[93,137],[139,...   \n",
       "..                                                 ...   \n",
       "187  [[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...   \n",
       "188  [[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...   \n",
       "189  [[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...   \n",
       "190  [[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...   \n",
       "191  [[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...   \n",
       "\n",
       "                                     interface_range_1  entity_id_2  ...  \\\n",
       "0    [[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...            3  ...   \n",
       "1    [[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...            3  ...   \n",
       "2    [[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...            3  ...   \n",
       "3    [[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...            3  ...   \n",
       "4    [[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...            3  ...   \n",
       "..                                                 ...          ...  ...   \n",
       "187  [[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...            2  ...   \n",
       "188  [[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...            2  ...   \n",
       "189  [[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...            2  ...   \n",
       "190  [[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...            2  ...   \n",
       "191  [[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...            2  ...   \n",
       "\n",
       "    id_score_2 select_tag_2 select_rank_2  in_i3d  best_select_rank  \\\n",
       "0          -67        False             2   False                -1   \n",
       "1          -67        False             2   False                -1   \n",
       "2          -67        False             2   False                 2   \n",
       "3          -67        False             2   False                 2   \n",
       "4          -67        False             2   False                 2   \n",
       "..         ...          ...           ...     ...               ...   \n",
       "187        -68        False            10    True                 5   \n",
       "188        -68        False            -1    True                -1   \n",
       "189        -68        False             4    True                 4   \n",
       "190        -68        False            13    True                 5   \n",
       "191        -68        False            10    True                 5   \n",
       "\n",
       "                                 unp_interface_range_1  \\\n",
       "0    ((328, 329), (331, 331), (367, 368), (371, 373...   \n",
       "1    ((328, 329), (331, 331), (367, 368), (371, 373...   \n",
       "2    ((1233, 1234), (1236, 1236), (1272, 1273), (12...   \n",
       "3    ((1233, 1234), (1236, 1236), (1272, 1273), (12...   \n",
       "4    ((1233, 1234), (1236, 1236), (1272, 1273), (12...   \n",
       "..                                                 ...   \n",
       "187  ((11, 13), (17, 17), (21, 21), (25, 25), (27, ...   \n",
       "188  ((11, 13), (17, 17), (21, 21), (25, 25), (27, ...   \n",
       "189  ((11, 13), (17, 17), (21, 21), (25, 25), (27, ...   \n",
       "190  ((11, 13), (17, 17), (21, 21), (25, 25), (27, ...   \n",
       "191  ((11, 13), (17, 17), (21, 21), (25, 25), (27, ...   \n",
       "\n",
       "                                 unp_interface_range_2 i_select_tag  \\\n",
       "0    ((12, 12), (17, 17), (21, 21), (25, 25), (29, ...        False   \n",
       "1    ((12, 12), (17, 17), (21, 21), (25, 25), (29, ...        False   \n",
       "2    ((12, 12), (17, 17), (21, 21), (25, 25), (29, ...        False   \n",
       "3    ((12, 12), (17, 17), (21, 21), (25, 25), (29, ...        False   \n",
       "4    ((12, 12), (17, 17), (21, 21), (25, 25), (29, ...        False   \n",
       "..                                                 ...          ...   \n",
       "187  ((1233, 1234), (1236, 1237), (1272, 1278), (12...        False   \n",
       "188  ((328, 329), (331, 332), (367, 373), (377, 378...        False   \n",
       "189  ((1233, 1234), (1236, 1237), (1272, 1278), (12...        False   \n",
       "190  ((1233, 1234), (1236, 1237), (1272, 1278), (12...        False   \n",
       "191  ((1233, 1234), (1236, 1237), (1272, 1278), (12...        False   \n",
       "\n",
       "    i_select_rank               i_group  \n",
       "0              -1  (P01116-2, P21359-3)  \n",
       "1              -1    (P01116, P21359-3)  \n",
       "2              13  (P01116-2, P21359-2)  \n",
       "3              13    (P01116, P21359-2)  \n",
       "4              13  (P01116-2, P21359-6)  \n",
       "..            ...                   ...  \n",
       "187             8    (P01116, P21359-2)  \n",
       "188            -1    (P01116, P21359-5)  \n",
       "189             8    (P01116, P21359-4)  \n",
       "190             8      (P01116, P21359)  \n",
       "191             8    (P01116, P21359-6)  \n",
       "\n",
       "[192 rows x 119 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>entity_id_1</th>\n      <th>chain_id_1</th>\n      <th>struct_asym_id_1</th>\n      <th>struct_asym_id_in_assembly_1</th>\n      <th>asym_id_rank_1</th>\n      <th>model_id_1</th>\n      <th>molecule_type_1</th>\n      <th>surface_range_1</th>\n      <th>interface_range_1</th>\n      <th>entity_id_2</th>\n      <th>...</th>\n      <th>id_score_2</th>\n      <th>select_tag_2</th>\n      <th>select_rank_2</th>\n      <th>in_i3d</th>\n      <th>best_select_rank</th>\n      <th>unp_interface_range_1</th>\n      <th>unp_interface_range_2</th>\n      <th>i_select_tag</th>\n      <th>i_select_rank</th>\n      <th>i_group</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[4,59],[61,83],[85,86],[88,91],[93,137],[139,...</td>\n      <td>[[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...</td>\n      <td>3</td>\n      <td>...</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>2</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>((328, 329), (331, 331), (367, 368), (371, 373...</td>\n      <td>((12, 12), (17, 17), (21, 21), (25, 25), (29, ...</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>(P01116-2, P21359-3)</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[4,59],[61,83],[85,86],[88,91],[93,137],[139,...</td>\n      
<td>[[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...</td>\n      <td>3</td>\n      <td>...</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>2</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>((328, 329), (331, 331), (367, 368), (371, 373...</td>\n      <td>((12, 12), (17, 17), (21, 21), (25, 25), (29, ...</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>(P01116, P21359-3)</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[4,59],[61,83],[85,86],[88,91],[93,137],[139,...</td>\n      <td>[[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...</td>\n      <td>3</td>\n      <td>...</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>2</td>\n      <td>False</td>\n      <td>2</td>\n      <td>((1233, 1234), (1236, 1236), (1272, 1273), (12...</td>\n      <td>((12, 12), (17, 17), (21, 21), (25, 25), (29, ...</td>\n      <td>False</td>\n      <td>13</td>\n      <td>(P01116-2, P21359-2)</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[4,59],[61,83],[85,86],[88,91],[93,137],[139,...</td>\n      <td>[[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...</td>\n      <td>3</td>\n      <td>...</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>2</td>\n      <td>False</td>\n      <td>2</td>\n      <td>((1233, 1234), (1236, 1236), (1272, 1273), (12...</td>\n      <td>((12, 12), (17, 17), (21, 21), (25, 25), (29, ...</td>\n      <td>False</td>\n      <td>13</td>\n      <td>(P01116, P21359-2)</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>2</td>\n      <td>B</td>\n      <td>B</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[4,59],[61,83],[85,86],[88,91],[93,137],[139,...</td>\n      
<td>[[32,33],[35,35],[71,72],[75,77],[81,82],[85,8...</td>\n      <td>3</td>\n      <td>...</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>2</td>\n      <td>False</td>\n      <td>2</td>\n      <td>((1233, 1234), (1236, 1236), (1272, 1273), (12...</td>\n      <td>((12, 12), (17, 17), (21, 21), (25, 25), (29, ...</td>\n      <td>False</td>\n      <td>13</td>\n      <td>(P01116-2, P21359-6)</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>187</th>\n      <td>1</td>\n      <td>C</td>\n      <td>C</td>\n      <td>C</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...</td>\n      <td>[[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>10</td>\n      <td>True</td>\n      <td>5</td>\n      <td>((11, 13), (17, 17), (21, 21), (25, 25), (27, ...</td>\n      <td>((1233, 1234), (1236, 1237), (1272, 1278), (12...</td>\n      <td>False</td>\n      <td>8</td>\n      <td>(P01116, P21359-2)</td>\n    </tr>\n    <tr>\n      <th>188</th>\n      <td>1</td>\n      <td>C</td>\n      <td>C</td>\n      <td>C</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...</td>\n      <td>[[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>True</td>\n      <td>-1</td>\n      <td>((11, 13), 
(17, 17), (21, 21), (25, 25), (27, ...</td>\n      <td>((328, 329), (331, 332), (367, 373), (377, 378...</td>\n      <td>False</td>\n      <td>-1</td>\n      <td>(P01116, P21359-5)</td>\n    </tr>\n    <tr>\n      <th>189</th>\n      <td>1</td>\n      <td>C</td>\n      <td>C</td>\n      <td>C</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...</td>\n      <td>[[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>4</td>\n      <td>True</td>\n      <td>4</td>\n      <td>((11, 13), (17, 17), (21, 21), (25, 25), (27, ...</td>\n      <td>((1233, 1234), (1236, 1237), (1272, 1278), (12...</td>\n      <td>False</td>\n      <td>8</td>\n      <td>(P01116, P21359-4)</td>\n    </tr>\n    <tr>\n      <th>190</th>\n      <td>1</td>\n      <td>C</td>\n      <td>C</td>\n      <td>C</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...</td>\n      <td>[[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>13</td>\n      <td>True</td>\n      <td>5</td>\n      <td>((11, 13), (17, 17), (21, 21), (25, 25), (27, ...</td>\n      <td>((1233, 1234), (1236, 1237), (1272, 1278), (12...</td>\n      <td>False</td>\n      <td>8</td>\n      <td>(P01116, P21359)</td>\n    </tr>\n    <tr>\n      <th>191</th>\n      <td>1</td>\n      <td>C</td>\n      <td>C</td>\n      <td>C</td>\n      <td>1</td>\n      <td>1</td>\n      <td>polypeptide(L)</td>\n      <td>[[2,20],[22,23],[25,51],[53,53],[55,55],[57,72...</td>\n      <td>[[12,14],[18,18],[22,22],[26,26],[28,28],[30,3...</td>\n      <td>2</td>\n      <td>...</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>10</td>\n      <td>True</td>\n      <td>5</td>\n      <td>((11, 13), (17, 17), (21, 21), (25, 25), 
(27, ...</td>\n      <td>((1233, 1234), (1236, 1237), (1272, 1278), (12...</td>\n      <td>False</td>\n      <td>8</td>\n      <td>(P01116, P21359-6)</td>\n    </tr>\n  </tbody>\n</table>\n<p>192 rows × 119 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 8
    }
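   ],
   "source": [
    "# NOTE: Retrieve the interaction partners of the target protein\n",
    "%time df3 = demo.pipe_select_he(run_as_completed=True, progress_bar=tqdm).result()\n",
    "# NOTE: df3 holds every heteromeric interacting chain pair matched to this UniProt entry; each row is one interaction (two chains belonging to two different proteins)\n",
    "df3\n",
    "# NOTE: The recommended records are those with i_select_tag == True, ranked by a series of scores; interface coverage is also considered, so that as many distinct interaction structures as possible are selected within the same heteromeric pair\n",
    "# NOTE: The input UniProt isoform may appear as either UniProt_1 or UniProt_2\n",
    "# NOTE: Records belonging to the same heteromeric pair can be identified by i_group"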
   ]
  },
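  {
   "source": [
    "> summary: `pipe_select_mo` returns monomer information for the target protein; `pipe_select_ho` and `pipe_select_he` return information on its homomeric and heteromeric interactions, respectively. Their columns are mostly the same as those returned by `pipe_select_mo`, but they are given for both interacting proteins.\n",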
    "\n",
    "## Mapping Between Sites in UniProt Isoform and PDB Chain"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "   UniProt chain_id  entity_id  identity  is_canonical pdb_id struct_asym_id  \\\n",
       "12  P21359        B          1      0.98          True   3p7z              B   \n",
       "13  P21359        B          2      0.94          True   6v6f              B   \n",
       "\n",
       "    pdb_range      unp_range   Entry  ... resolution  \\\n",
       "12  [[5,276]]  [[1566,1837]]  P21359  ...      2.650   \n",
       "13  [[2,329]]  [[1203,1551]]  P21359  ...      2.542   \n",
       "\n",
       "   experimental_method_class  experimental_method  multi_method  \\\n",
       "12                     x-ray    X-ray diffraction         False   \n",
       "13                     x-ray    X-ray diffraction         False   \n",
       "\n",
       "    revision_date deposition_date 1/resolution id_score  select_tag  \\\n",
       "12       20190717        20101013     0.377358      -66        True   \n",
       "13       20200805        20191205     0.393391      -66        True   \n",
       "\n",
       "    select_rank  \n",
       "12            1  \n",
       "13           10  \n",
       "\n",
       "[2 rows x 48 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>UniProt</th>\n      <th>chain_id</th>\n      <th>entity_id</th>\n      <th>identity</th>\n      <th>is_canonical</th>\n      <th>pdb_id</th>\n      <th>struct_asym_id</th>\n      <th>pdb_range</th>\n      <th>unp_range</th>\n      <th>Entry</th>\n      <th>...</th>\n      <th>resolution</th>\n      <th>experimental_method_class</th>\n      <th>experimental_method</th>\n      <th>multi_method</th>\n      <th>revision_date</th>\n      <th>deposition_date</th>\n      <th>1/resolution</th>\n      <th>id_score</th>\n      <th>select_tag</th>\n      <th>select_rank</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>12</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>1</td>\n      <td>0.98</td>\n      <td>True</td>\n      <td>3p7z</td>\n      <td>B</td>\n      <td>[[5,276]]</td>\n      <td>[[1566,1837]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.650</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190717</td>\n      <td>20101013</td>\n      <td>0.377358</td>\n      <td>-66</td>\n      <td>True</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>13</th>\n      <td>P21359</td>\n      <td>B</td>\n      <td>2</td>\n      <td>0.94</td>\n      <td>True</td>\n      <td>6v6f</td>\n      <td>B</td>\n      <td>[[2,329]]</td>\n      <td>[[1203,1551]]</td>\n      <td>P21359</td>\n      <td>...</td>\n      <td>2.542</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20200805</td>\n      <td>20191205</td>\n      <td>0.393391</td>\n      <td>-66</td>\n      <td>True</td>\n      
<td>10</td>\n    </tr>\n  </tbody>\n</table>\n<p>2 rows × 48 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 9
    }
   ],
   "source": [
    "df1[df1.select_tag.eq(True)]"
   ]
  },
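  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# NOTE: Pick one mapping between a UniProt isoform and a PDB chain\n",
    "# NOTE: Among the PDB chains in df1 with select_tag == True, choose a structure that covers your mutation; to check coverage, see whether the new_unp_range interval contains the mutated position\n",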
    "record = df1.loc[12]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "UniProt                                                                 P21359\n",
       "chain_id                                                                     B\n",
       "entity_id                                                                    1\n",
       "identity                                                                  0.98\n",
       "is_canonical                                                              True\n",
       "pdb_id                                                                    3p7z\n",
       "struct_asym_id                                                               B\n",
       "pdb_range                                                            [[5,276]]\n",
       "unp_range                                                        [[1566,1837]]\n",
       "Entry                                                                   P21359\n",
       "range_diff                                                                 [0]\n",
       "sifts_range_tag                                                           Safe\n",
       "repeated                                                                 False\n",
       "reversed                                                                 False\n",
       "InDel_sum                                                                    0\n",
       "new_pdb_range                                                        [[5,276]]\n",
       "new_unp_range                                                    [[1566,1837]]\n",
       "conflict_pdb_range                                                 ((44, 44),)\n",
       "unp_len                                                                   2839\n",
       "BINDING_LIGAND_COUNT                                                        16\n",
       "BINDING_LIGAND_INDEX         [[80, 80], [92, 95], [101, 102], [110, 110], [...\n",
       "OBS_COUNT                                                                  270\n",
       "OBS_INDEX                                                           [[7, 276]]\n",
       "OBS_RATIO_SUM                                                           269.18\n",
       "ARTIFACT_INDEX                                                        [[1, 4]]\n",
       "NON_COUNT                                                                    0\n",
       "NON_INDEX                                                                   []\n",
       "SEQRES_COUNT                                                               276\n",
       "STD_COUNT                                                                  276\n",
       "STD_INDEX                                                           [[1, 276]]\n",
       "UNK_COUNT                                                                    0\n",
       "UNK_INDEX                                                                   []\n",
       "ca_p_only                                                                False\n",
       "molecule_type                                                   polypeptide(L)\n",
       "OBS_STD_INDEX                                                      ((7, 276),)\n",
       "OBS_STD_COUNT                                                              270\n",
       "RAW_BS                                                               0.0882066\n",
       "RAW_BS_IG3                                                           0.0938424\n",
       "resolution                                                                2.65\n",
       "experimental_method_class                                                x-ray\n",
       "experimental_method                                          X-ray diffraction\n",
       "multi_method                                                             False\n",
       "revision_date                                                         20190717\n",
       "deposition_date                                                       20101013\n",
       "1/resolution                                                          0.377358\n",
       "id_score                                                                   -66\n",
       "select_tag                                                                True\n",
       "select_rank                                                                  1\n",
       "Name: 12, dtype: object"
      ]
     },
     "metadata": {},
     "execution_count": 11
    }
   ],
   "source": [
    "# show it\n",
    "record"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "output_type": "display_data",
     "data": {
      "text/plain": "<IPython.core.display.HTML object>",
      "text/html": "\n        <style>\n            img.display {\n                -webkit-filter: invert(1);\n                filter: invert(1);\n                }\n        </style>\n    \n        <table align=\"center\">\n            <tr>\n            \n                <td>\n                    <b>Asymmetric unit</b> of 3p7z\n                </td>\n    \n                <td>\n                    <b>Biological assembly 1</b> of 3p7z\n                </td>\n    \n            </tr>\n            <tr>\n            \n                <td>\n                    <img class=\"display\" width=\"300em\" src=\"https://cdn.rcsb.org/images/structures/p7/3p7z/3p7z_model-1.jpeg\"/>\n                </td>\n    \n                <td>\n                    <img class=\"display\" width=\"300em\" src=\"https://cdn.rcsb.org/images/structures/p7/3p7z/3p7z_assembly-1.jpeg\"/>\n                </td>\n    \n            </tr>\n        </table>\n        "
     },
     "metadata": {}
    }
   ],
   "source": [
    "# NOTE: Quick look at static images of the structure\n",
    "DisplayPDB(True).show(record['pdb_id'])"
   ]
  },
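  {
   "source": [
    "* `unp_residue_number`: residue index with respect to the UniProt isoform's sequence (1-based)\n",
    "* `residue_number`: residue index with respect to the PDB chain's sequence (SEQRES, 1-based)\n",
    "* `author_residue_number`: residue index with respect to the PDB chain's sequence, as assigned by the author of the PDB entry\n",
    "\n",
    "> summary: `unp_residue_number` is the 1-based index into the UniProt sequence; `residue_number` is the 1-based index into the PDB chain; `author_residue_number` is the index defined by the author of the PDB file\n",
    "\n",
    "> Because `unp_residue_number` and `author_residue_number` may differ, and most software expects `author_residue_number` as input, convert all sites to `author_residue_number` before using them (e.g. position 12 in the UniProt sequence is not necessarily numbered 12 in the PDB chain)\n",
    "\n",
    "> Whether PDB and UniProt numbering agree is generally not known in advance; this is an index-mapping problem. For example, a UniProt sequence of length 100 is indexed 1, 2, ..., 100, while the matching chain in a PDB crystal structure may have length 91 and be numbered 66, 67, ..., 156 by the author"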
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "     unp_residue_number  residue_number UniProt author_insertion_code  \\\n",
       "0                  1566               5  P21359                         \n",
       "1                  1567               6  P21359                         \n",
       "2                  1568               7  P21359                         \n",
       "3                  1569               8  P21359                         \n",
       "4                  1570               9  P21359                         \n",
       "..                  ...             ...     ...                   ...   \n",
       "267                1833             272  P21359                         \n",
       "268                1834             273  P21359                         \n",
       "269                1835             274  P21359                         \n",
       "270                1836             275  P21359                         \n",
       "271                1837             276  P21359                         \n",
       "\n",
       "     author_residue_number chain_id  entity_id  multiple_conformers  \\\n",
       "0                     1545        B          1                  NaN   \n",
       "1                     1546        B          1                  NaN   \n",
       "2                     1547        B          1                  NaN   \n",
       "3                     1548        B          1                  NaN   \n",
       "4                     1549        B          1                  NaN   \n",
       "..                     ...      ...        ...                  ...   \n",
       "267                   1812        B          1                  NaN   \n",
       "268                   1813        B          1                  NaN   \n",
       "269                   1814        B          1                  NaN   \n",
       "270                   1815        B          1                  NaN   \n",
       "271                   1816        B          1                  NaN   \n",
       "\n",
       "     observed_ratio pdb_id residue_name struct_asym_id  \n",
       "0               0.0   3p7z          SER              B  \n",
       "1               0.0   3p7z          SER              B  \n",
       "2               1.0   3p7z          LYS              B  \n",
       "3               1.0   3p7z          PHE              B  \n",
       "4               1.0   3p7z          GLU              B  \n",
       "..              ...    ...          ...            ...  \n",
       "267             1.0   3p7z          LEU              B  \n",
       "268             1.0   3p7z          SER              B  \n",
       "269             1.0   3p7z          GLN              B  \n",
       "270             1.0   3p7z          PRO              B  \n",
       "271             1.0   3p7z          ASP              B  \n",
       "\n",
       "[272 rows x 12 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>unp_residue_number</th>\n      <th>residue_number</th>\n      <th>UniProt</th>\n      <th>author_insertion_code</th>\n      <th>author_residue_number</th>\n      <th>chain_id</th>\n      <th>entity_id</th>\n      <th>multiple_conformers</th>\n      <th>observed_ratio</th>\n      <th>pdb_id</th>\n      <th>residue_name</th>\n      <th>struct_asym_id</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1566</td>\n      <td>5</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1545</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>0.0</td>\n      <td>3p7z</td>\n      <td>SER</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1567</td>\n      <td>6</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1546</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>0.0</td>\n      <td>3p7z</td>\n      <td>SER</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>1568</td>\n      <td>7</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1547</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>LYS</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1569</td>\n      <td>8</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1548</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>PHE</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1570</td>\n      <td>9</td>\n      <td>P21359</td>\n      <td></td>\n      
<td>1549</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>GLU</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>267</th>\n      <td>1833</td>\n      <td>272</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1812</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>LEU</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>268</th>\n      <td>1834</td>\n      <td>273</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1813</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>SER</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>269</th>\n      <td>1835</td>\n      <td>274</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1814</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>GLN</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>270</th>\n      <td>1836</td>\n      <td>275</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1815</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>PRO</td>\n      <td>B</td>\n    </tr>\n    <tr>\n      <th>271</th>\n      <td>1837</td>\n      <td>276</td>\n      <td>P21359</td>\n      <td></td>\n      <td>1816</td>\n      <td>B</td>\n      <td>1</td>\n      <td>NaN</td>\n      <td>1.0</td>\n      <td>3p7z</td>\n      <td>ASP</td>\n      <td>B</td>\n    </tr>\n  </tbody>\n</table>\n<p>272 rows × 12 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 13
    }
   ],
   "source": [
    "PDB(record['pdb_id']).get_expanded_map_res_df(\n",
    "    record['UniProt'], \n",
    "    record['new_unp_range'], \n",
    "    record['new_pdb_range'], \n",
    "    struct_asym_id=record['struct_asym_id']).result()"
   ]
  },
  {
   "source": [
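    "**Example**: the original mutation (p.P1836R) $\\rightarrow$ becomes (P1815R) in chain B of 3p7z $\\rightarrow$ becomes (PB1815R) as the mutation-position input for `foldx`\n",
    "\n",
    "The point of interest here is the 1836 $\\rightarrow$ 1815 conversion.\n",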
    "\n",
    "You can also use the `residue_name` column, or the `conflict_pdb_range` column of df1, to confirm that the residue at this position is indeed PRO."
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
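    "**WARNING**: If the `author_insertion_code` of your site is non-empty, or its `author_residue_number` is less than or equal to 0, please contact the class study representative; extra steps are required."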
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## Batch Retrieve\n",
    "\n",
    "The following shows how to process multiple UniProt Isoforms in one batch."
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stderr",
     "text": [
      "100%|██████████| 3/3 [00:06<00:00,  2.24s/it]\n"
     ]
    }
   ],
   "source": [
    "# optional\n",
    "res = SIFTSs(('Q00987-2', 'Q00987-10', 'O15350')).fetch('pipe_select_mo').run(tqdm).result()\n",
    "# NOTE: passing tqdm is optional; it does not affect the run\n",
    "# NOTE: res is a list and the order of the result can be different from the original input order"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "    UniProt chain_id  entity_id  identity  is_canonical pdb_id struct_asym_id  \\\n",
       "0    O15350        A          1      0.99          True   3vd2              A   \n",
       "1    O15350        B          1      0.99          True   3vd2              B   \n",
       "2    O15350        C          1      0.99          True   3vd2              C   \n",
       "3    O15350        D          1      0.99          True   3vd2              D   \n",
       "4    O15350        I          1      0.99          True   3vd2              E   \n",
       "..      ...      ...        ...       ...           ...    ...            ...   \n",
       "109  O15350        F          1      0.92          True   5hob              F   \n",
       "110  O15350        G          1      0.92          True   5hob              G   \n",
       "111  O15350        H          1      0.92          True   5hob              H   \n",
       "112  O15350        A          1      0.92          True   5hoc              A   \n",
       "113  O15350        B          1      0.92          True   5hoc              B   \n",
       "\n",
       "      pdb_range    unp_range   Entry  ... resolution  \\\n",
       "0    [[13,210]]  [[115,312]]  O15350  ...   4.000000   \n",
       "1    [[13,210]]  [[115,312]]  O15350  ...   4.000000   \n",
       "2    [[13,210]]  [[115,312]]  O15350  ...   4.000000   \n",
       "3    [[13,210]]  [[115,312]]  O15350  ...   4.000000   \n",
       "4    [[13,210]]  [[115,312]]  O15350  ...   4.000000   \n",
       "..          ...          ...     ...  ...        ...   \n",
       "109    [[3,50]]  [[351,398]]  O15350  ...   1.220013   \n",
       "110    [[3,50]]  [[351,398]]  O15350  ...   1.220013   \n",
       "111    [[3,50]]  [[351,398]]  O15350  ...   1.220013   \n",
       "112    [[3,50]]  [[351,398]]  O15350  ...   1.360078   \n",
       "113    [[3,50]]  [[351,398]]  O15350  ...   1.360078   \n",
       "\n",
       "    experimental_method_class  experimental_method  multi_method  \\\n",
       "0                       x-ray    X-ray diffraction         False   \n",
       "1                       x-ray    X-ray diffraction         False   \n",
       "2                       x-ray    X-ray diffraction         False   \n",
       "3                       x-ray    X-ray diffraction         False   \n",
       "4                       x-ray    X-ray diffraction         False   \n",
       "..                        ...                  ...           ...   \n",
       "109                     x-ray    X-ray diffraction         False   \n",
       "110                     x-ray    X-ray diffraction         False   \n",
       "111                     x-ray    X-ray diffraction         False   \n",
       "112                     x-ray    X-ray diffraction         False   \n",
       "113                     x-ray    X-ray diffraction         False   \n",
       "\n",
       "     revision_date deposition_date 1/resolution id_score  select_tag  \\\n",
       "0         20120725        20120104     0.250000      -65       False   \n",
       "1         20120725        20120104     0.250000      -66       False   \n",
       "2         20120725        20120104     0.250000      -67       False   \n",
       "3         20120725        20120104     0.250000      -68       False   \n",
       "4         20120725        20120104     0.250000      -73       False   \n",
       "..             ...             ...          ...      ...         ...   \n",
       "109       20161019        20160119     0.819663      -70       False   \n",
       "110       20161019        20160119     0.819663      -71       False   \n",
       "111       20161019        20160119     0.819663      -72       False   \n",
       "112       20161130        20160119     0.735252      -65       False   \n",
       "113       20161130        20160119     0.735252      -66       False   \n",
       "\n",
       "     select_rank  \n",
       "0             28  \n",
       "1             36  \n",
       "2             29  \n",
       "3             41  \n",
       "4             30  \n",
       "..           ...  \n",
       "109           64  \n",
       "110           53  \n",
       "111           59  \n",
       "112           60  \n",
       "113           61  \n",
       "\n",
       "[114 rows x 48 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>UniProt</th>\n      <th>chain_id</th>\n      <th>entity_id</th>\n      <th>identity</th>\n      <th>is_canonical</th>\n      <th>pdb_id</th>\n      <th>struct_asym_id</th>\n      <th>pdb_range</th>\n      <th>unp_range</th>\n      <th>Entry</th>\n      <th>...</th>\n      <th>resolution</th>\n      <th>experimental_method_class</th>\n      <th>experimental_method</th>\n      <th>multi_method</th>\n      <th>revision_date</th>\n      <th>deposition_date</th>\n      <th>1/resolution</th>\n      <th>id_score</th>\n      <th>select_tag</th>\n      <th>select_rank</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>O15350</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>3vd2</td>\n      <td>A</td>\n      <td>[[13,210]]</td>\n      <td>[[115,312]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>4.000000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20120725</td>\n      <td>20120104</td>\n      <td>0.250000</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>28</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>O15350</td>\n      <td>B</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>3vd2</td>\n      <td>B</td>\n      <td>[[13,210]]</td>\n      <td>[[115,312]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>4.000000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20120725</td>\n      <td>20120104</td>\n      <td>0.250000</td>\n      <td>-66</td>\n      <td>False</td>\n    
  <td>36</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>O15350</td>\n      <td>C</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>3vd2</td>\n      <td>C</td>\n      <td>[[13,210]]</td>\n      <td>[[115,312]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>4.000000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20120725</td>\n      <td>20120104</td>\n      <td>0.250000</td>\n      <td>-67</td>\n      <td>False</td>\n      <td>29</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>O15350</td>\n      <td>D</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>3vd2</td>\n      <td>D</td>\n      <td>[[13,210]]</td>\n      <td>[[115,312]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>4.000000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20120725</td>\n      <td>20120104</td>\n      <td>0.250000</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>41</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>O15350</td>\n      <td>I</td>\n      <td>1</td>\n      <td>0.99</td>\n      <td>True</td>\n      <td>3vd2</td>\n      <td>E</td>\n      <td>[[13,210]]</td>\n      <td>[[115,312]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>4.000000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20120725</td>\n      <td>20120104</td>\n      <td>0.250000</td>\n      <td>-73</td>\n      <td>False</td>\n      <td>30</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>109</th>\n      <td>O15350</td>\n      <td>F</td>\n      <td>1</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>5hob</td>\n      <td>F</td>\n      <td>[[3,50]]</td>\n      <td>[[351,398]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>1.220013</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20161019</td>\n      <td>20160119</td>\n      <td>0.819663</td>\n      <td>-70</td>\n      <td>False</td>\n      <td>64</td>\n    </tr>\n    <tr>\n      <th>110</th>\n      <td>O15350</td>\n      <td>G</td>\n      <td>1</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>5hob</td>\n      <td>G</td>\n      <td>[[3,50]]</td>\n      <td>[[351,398]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>1.220013</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20161019</td>\n      <td>20160119</td>\n      <td>0.819663</td>\n      <td>-71</td>\n      <td>False</td>\n      <td>53</td>\n    </tr>\n    <tr>\n      <th>111</th>\n      <td>O15350</td>\n      <td>H</td>\n      <td>1</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>5hob</td>\n      <td>H</td>\n      <td>[[3,50]]</td>\n      <td>[[351,398]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>1.220013</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20161019</td>\n      <td>20160119</td>\n      <td>0.819663</td>\n      <td>-72</td>\n      <td>False</td>\n      <td>59</td>\n    </tr>\n    <tr>\n      <th>112</th>\n      <td>O15350</td>\n      <td>A</td>\n      <td>1</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>5hoc</td>\n      <td>A</td>\n      <td>[[3,50]]</td>\n      <td>[[351,398]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>1.360078</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20161130</td>\n      
<td>20160119</td>\n      <td>0.735252</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>60</td>\n    </tr>\n    <tr>\n      <th>113</th>\n      <td>O15350</td>\n      <td>B</td>\n      <td>1</td>\n      <td>0.92</td>\n      <td>True</td>\n      <td>5hoc</td>\n      <td>B</td>\n      <td>[[3,50]]</td>\n      <td>[[351,398]]</td>\n      <td>O15350</td>\n      <td>...</td>\n      <td>1.360078</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20161130</td>\n      <td>20160119</td>\n      <td>0.735252</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>61</td>\n    </tr>\n  </tbody>\n</table>\n<p>114 rows × 48 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 15
    }
   ],
   "source": [
    "res[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "      UniProt chain_id  entity_id  identity  is_canonical pdb_id  \\\n",
       "0    Q00987-2        A          1      1.00         False   4jv9   \n",
       "1    Q00987-2        C          3      0.97         False   5mnj   \n",
       "2    Q00987-2        G          3      0.97         False   5mnj   \n",
       "3    Q00987-2        A          1      1.00         False   4ogv   \n",
       "4    Q00987-2        B          1      1.00         False   4ogv   \n",
       "..        ...      ...        ...       ...           ...    ...   \n",
       "208  Q00987-2        A          1      1.00         False   6q9o   \n",
       "209  Q00987-2        B          1      1.00         False   6q9o   \n",
       "210  Q00987-2        A          1      1.00         False   6q9h   \n",
       "211  Q00987-2        D          2      1.00         False   3mqs   \n",
       "212  Q00987-2        B          2      1.00         False   5wts   \n",
       "\n",
       "    struct_asym_id  pdb_range    unp_range   Entry  ... resolution  \\\n",
       "0                A   [[1,10]]    [[18,27]]  Q00987  ...      2.500   \n",
       "1                C  [[23,86]]  [[233,296]]  Q00987  ...      2.160   \n",
       "2                G  [[23,86]]  [[233,296]]  Q00987  ...      2.160   \n",
       "3                A   [[1,11]]    [[17,27]]  Q00987  ...      2.197   \n",
       "4                B   [[1,11]]    [[17,27]]  Q00987  ...      2.197   \n",
       "..             ...        ...          ...     ...  ...        ...   \n",
       "208              A   [[2,12]]    [[17,27]]  Q00987  ...      1.210   \n",
       "209              B   [[2,12]]    [[17,27]]  Q00987  ...      1.210   \n",
       "210              A   [[2,12]]    [[17,27]]  Q00987  ...      2.000   \n",
       "211              B   [[1,10]]  [[199,208]]  Q00987  ...      2.400   \n",
       "212              B   [[3,24]]     [[6,27]]  Q00987  ...      3.004   \n",
       "\n",
       "    experimental_method_class  experimental_method  multi_method  \\\n",
       "0                       x-ray    X-ray diffraction         False   \n",
       "1                       x-ray    X-ray diffraction         False   \n",
       "2                       x-ray    X-ray diffraction         False   \n",
       "3                       x-ray    X-ray diffraction         False   \n",
       "4                       x-ray    X-ray diffraction         False   \n",
       "..                        ...                  ...           ...   \n",
       "208                     x-ray    X-ray diffraction         False   \n",
       "209                     x-ray    X-ray diffraction         False   \n",
       "210                     x-ray    X-ray diffraction         False   \n",
       "211                     x-ray    X-ray diffraction         False   \n",
       "212                     x-ray    X-ray diffraction         False   \n",
       "\n",
       "     revision_date deposition_date 1/resolution id_score  select_tag  \\\n",
       "0         20130605        20130325     0.400000      -65       False   \n",
       "1         20191016        20161213     0.462963      -67       False   \n",
       "2         20191016        20161213     0.462963      -71       False   \n",
       "3         20140618        20140116     0.455166      -65       False   \n",
       "4         20140618        20140116     0.455166      -66       False   \n",
       "..             ...             ...          ...      ...         ...   \n",
       "208       20190724        20181218     0.826446      -65       False   \n",
       "209       20190724        20181218     0.826446      -66       False   \n",
       "210       20190724        20181218     0.500000      -65       False   \n",
       "211       20110713        20100428     0.416667      -68       False   \n",
       "212       20191106        20161214     0.332889      -66       False   \n",
       "\n",
       "     select_rank  \n",
       "0             -1  \n",
       "1             12  \n",
       "2             11  \n",
       "3             -1  \n",
       "4             -1  \n",
       "..           ...  \n",
       "208           -1  \n",
       "209           -1  \n",
       "210           -1  \n",
       "211           14  \n",
       "212           -1  \n",
       "\n",
       "[213 rows x 48 columns]"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>UniProt</th>\n      <th>chain_id</th>\n      <th>entity_id</th>\n      <th>identity</th>\n      <th>is_canonical</th>\n      <th>pdb_id</th>\n      <th>struct_asym_id</th>\n      <th>pdb_range</th>\n      <th>unp_range</th>\n      <th>Entry</th>\n      <th>...</th>\n      <th>resolution</th>\n      <th>experimental_method_class</th>\n      <th>experimental_method</th>\n      <th>multi_method</th>\n      <th>revision_date</th>\n      <th>deposition_date</th>\n      <th>1/resolution</th>\n      <th>id_score</th>\n      <th>select_tag</th>\n      <th>select_rank</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>Q00987-2</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>4jv9</td>\n      <td>A</td>\n      <td>[[1,10]]</td>\n      <td>[[18,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.500</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20130605</td>\n      <td>20130325</td>\n      <td>0.400000</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>Q00987-2</td>\n      <td>C</td>\n      <td>3</td>\n      <td>0.97</td>\n      <td>False</td>\n      <td>5mnj</td>\n      <td>C</td>\n      <td>[[23,86]]</td>\n      <td>[[233,296]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.160</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191016</td>\n      <td>20161213</td>\n      <td>0.462963</td>\n      <td>-67</td>\n      <td>False</td>\n      
<td>12</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>Q00987-2</td>\n      <td>G</td>\n      <td>3</td>\n      <td>0.97</td>\n      <td>False</td>\n      <td>5mnj</td>\n      <td>G</td>\n      <td>[[23,86]]</td>\n      <td>[[233,296]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.160</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191016</td>\n      <td>20161213</td>\n      <td>0.462963</td>\n      <td>-71</td>\n      <td>False</td>\n      <td>11</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>Q00987-2</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>4ogv</td>\n      <td>A</td>\n      <td>[[1,11]]</td>\n      <td>[[17,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.197</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20140618</td>\n      <td>20140116</td>\n      <td>0.455166</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>Q00987-2</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>4ogv</td>\n      <td>B</td>\n      <td>[[1,11]]</td>\n      <td>[[17,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.197</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20140618</td>\n      <td>20140116</td>\n      <td>0.455166</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n    </tr>\n    <tr>\n      <th>208</th>\n      <td>Q00987-2</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>6q9o</td>\n      <td>A</td>\n      <td>[[2,12]]</td>\n      <td>[[17,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>1.210</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190724</td>\n      <td>20181218</td>\n      <td>0.826446</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>209</th>\n      <td>Q00987-2</td>\n      <td>B</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>6q9o</td>\n      <td>B</td>\n      <td>[[2,12]]</td>\n      <td>[[17,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>1.210</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190724</td>\n      <td>20181218</td>\n      <td>0.826446</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>210</th>\n      <td>Q00987-2</td>\n      <td>A</td>\n      <td>1</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>6q9h</td>\n      <td>A</td>\n      <td>[[2,12]]</td>\n      <td>[[17,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.000</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20190724</td>\n      <td>20181218</td>\n      <td>0.500000</td>\n      <td>-65</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n    <tr>\n      <th>211</th>\n      <td>Q00987-2</td>\n      <td>D</td>\n      <td>2</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>3mqs</td>\n      <td>B</td>\n      <td>[[1,10]]</td>\n      <td>[[199,208]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>2.400</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20110713</td>\n      <td>20100428</td>\n      
<td>0.416667</td>\n      <td>-68</td>\n      <td>False</td>\n      <td>14</td>\n    </tr>\n    <tr>\n      <th>212</th>\n      <td>Q00987-2</td>\n      <td>B</td>\n      <td>2</td>\n      <td>1.00</td>\n      <td>False</td>\n      <td>5wts</td>\n      <td>B</td>\n      <td>[[3,24]]</td>\n      <td>[[6,27]]</td>\n      <td>Q00987</td>\n      <td>...</td>\n      <td>3.004</td>\n      <td>x-ray</td>\n      <td>X-ray diffraction</td>\n      <td>False</td>\n      <td>20191106</td>\n      <td>20161214</td>\n      <td>0.332889</td>\n      <td>-66</td>\n      <td>False</td>\n      <td>-1</td>\n    </tr>\n  </tbody>\n</table>\n<p>213 rows × 48 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 16
    }
   ],
   "source": [
    "res[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ]
}